Overview


Dataset statistics

Number of variables: 20
Number of observations: 357234
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 265.4 MiB
Average record size in memory: 779.1 B

Variable types

Numeric: 8
Text: 4
DateTime: 1
Categorical: 7

Alerts

High correlation: Anno is highly overall correlated with CRASH_UNIT_ID and 1 other field
High correlation: CRASH_UNIT_ID is highly overall correlated with Anno and 1 other field
High correlation: MANEUVER is highly overall correlated with UNIT_TYPE
High correlation: UNIT_TYPE is highly overall correlated with MANEUVER
High correlation: VEHICLE_ID is highly overall correlated with Anno and 1 other field
Imbalance: UNIT_TYPE is highly imbalanced (94.8%)
Imbalance: VEHICLE_DEFECT is highly imbalanced (76.0%)
Imbalance: VEHICLE_TYPE is highly imbalanced (58.8%)
Imbalance: VEHICLE_USE is highly imbalanced (67.7%)
Skewed: VEHICLE_YEAR is highly skewed (γ1 = 42.64083)
Unique: CRASH_UNIT_ID has unique values
Unique: VEHICLE_ID has unique values
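
The 94.8% imbalance figure for UNIT_TYPE can be reproduced from its category counts, assuming the score is one minus the Shannon entropy of the class distribution divided by the log of the number of classes. That formula is an assumption about how the alert is computed, not something the report states, and the function name is hypothetical. The counts are the ones listed for UNIT_TYPE in the Variables section.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log(k): 0 for a uniform distribution, 1 when one class holds every row."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in nats; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# UNIT_TYPE counts: DRIVER, PARKED, DRIVERLESS, NON-CONTACT VEHICLE
unit_type_counts = [352571, 4461, 181, 21]
score = imbalance_score(unit_type_counts)
print(f"{score:.1%}")  # 94.8%
```

Under this assumed definition, a variable whose rows are spread evenly across its categories scores 0, so a value near 1 signals that a single category dominates.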

Reproduction

Analysis started: 2024-11-05 17:05:25.549537
Analysis finished: 2024-11-05 17:05:51.690947
Duration: 26.14 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json

Variables

CRASH_UNIT_ID
Real number (ℝ)

High correlation  Unique 

Distinct: 357234
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 276828.55
Minimum: 2
Maximum: 561564
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 2
5-th percentile: 27192.65
Q1: 136450.25
Median: 274486.5
Q3: 416253.75
95-th percentile: 532521.35
Maximum: 561564
Range: 561562
Interquartile range (IQR): 279803.5

Descriptive statistics

Standard deviation: 162004.35
Coefficient of variation (CV): 0.58521549
Kurtosis: -1.199653
Mean: 276828.55
Median Absolute Deviation (MAD): 139857
Skewness: 0.034817138
Sum: 9.889257 × 10^10
Variance: 2.6245411 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
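
Several of the CRASH_UNIT_ID figures are derived from others and can be cross-checked with simple arithmetic. The sketch below recomputes the range, the interquartile range, and the coefficient of variation from the rounded values reported for this variable; the variable names are illustrative.

```python
# Values copied from the CRASH_UNIT_ID statistics (rounded as reported).
minimum, maximum = 2, 561564
q1, q3 = 136450.25, 416253.75
mean, std = 276828.55, 162004.35

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)

print(value_range)    # 561562
print(iqr)            # 279803.5
print(round(cv, 4))   # 0.5852
```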
Common values

Value  Count  Frequency (%)
481322  1  < 0.1%
561563  1  < 0.1%
40727  1  < 0.1%
394635  1  < 0.1%
394634  1  < 0.1%
7739  1  < 0.1%
4918  1  < 0.1%
4917  1  < 0.1%
86  1  < 0.1%
58  1  < 0.1%
Other values (357224)  357224  > 99.9%

Minimum 10 values

Value  Count  Frequency (%)
2  1  < 0.1%
3  1  < 0.1%
7  1  < 0.1%
9  1  < 0.1%
10  1  < 0.1%
11  1  < 0.1%
12  1  < 0.1%
13  1  < 0.1%
14  1  < 0.1%
15  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
561564  1  < 0.1%
561563  1  < 0.1%
561547  1  < 0.1%
561546  1  < 0.1%
561542  1  < 0.1%
561541  1  < 0.1%
561540  1  < 0.1%
561532  1  < 0.1%
561529  1  < 0.1%
561528  1  < 0.1%

RD_NO
Text

Distinct: 215177
Distinct (%): 60.2%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 2857872
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 79079
Unique (%): 22.1%

Sample

1st row: JC113627
2nd row: JC113627
3rd row: JC113637
4th row: JC113637
5th row: JC113630
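
The report separates Distinct (the number of different values) from Unique (values that occur exactly once), which is why RD_NO has 215177 distinct values but only 79079 unique ones. A minimal sketch of that distinction, applied to the five sample rows and assuming those two definitions:

```python
from collections import Counter

sample = ["JC113627", "JC113627", "JC113637", "JC113637", "JC113630"]
counts = Counter(sample)

distinct = len(counts)                               # different values
unique = sum(1 for n in counts.values() if n == 1)   # values seen exactly once

print(distinct, unique)  # 3 1
```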
Words

Value  Count  Frequency (%)
ja522872  9  < 0.1%
jb571997  8  < 0.1%
jb174902  8  < 0.1%
jb210669  7  < 0.1%
ja364229  7  < 0.1%
ja518406  6  < 0.1%
hz306708  6  < 0.1%
jb398994  6  < 0.1%
jb249939  6  < 0.1%
jb557525  6  < 0.1%
Other values (215167)  357165  > 99.9%

Most occurring characters

Value  Count  Frequency (%)
J  283307  9.9%
4  269493  9.4%
5  250026  8.7%
1  249051  8.7%
3  246175  8.6%
2  244561  8.6%
0  183448  6.4%
6  180385  6.3%
7  174482  6.1%
9  172916  6.1%
Other values (23)  604028  21.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  2857872  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  2857872  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  2857872  100.0%

Distinct: 149378
Distinct (%): 41.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 MiB
Minimum: 2014-01-18 18:14:00
Maximum: 2019-01-11 23:36:00

Histogram with fixed size bins (bins=50)

UNIT_NO
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5158859
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 2
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.55405008
Coefficient of variation (CV): 0.36549589
Kurtosis: 0.98450568
Mean: 1.5158859
Median Absolute Deviation (MAD): 0
Skewness: 0.63476065
Sum: 541526
Variance: 0.30697149
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)
Common values

Value  Count  Frequency (%)
1  181687  50.9%
2  167949  47.0%
3  6681  1.9%
4  751  0.2%
5  126  < 0.1%
6  25  < 0.1%
7  8  < 0.1%
8  5  < 0.1%
9  2  < 0.1%

Minimum 10 values

Value  Count  Frequency (%)
1  181687  50.9%
2  167949  47.0%
3  6681  1.9%
4  751  0.2%
5  126  < 0.1%
6  25  < 0.1%
7  8  < 0.1%
8  5  < 0.1%
9  2  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
9  2  < 0.1%
8  5  < 0.1%
7  8  < 0.1%
6  25  < 0.1%
5  126  < 0.1%
4  751  0.2%
3  6681  1.9%
2  167949  47.0%
1  181687  50.9%
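
Because UNIT_NO takes only nine values, its summary statistics can be recomputed from the frequency table alone. The sketch below rebuilds the sum, mean, and variance from the UNIT_NO counts. It assumes the reported variance is the sample variance (divisor n - 1), which is the convention that matches the printed figure to the digits shown.

```python
# UNIT_NO frequency table: value -> count
freq = {1: 181687, 2: 167949, 3: 6681, 4: 751, 5: 126, 6: 25, 7: 8, 8: 5, 9: 2}

n = sum(freq.values())
total = sum(v * c for v, c in freq.items())
mean = total / n
# Sample variance (divisor n - 1), an assumption about the report's convention.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)

print(n, total)  # 357234 541526
print(round(mean, 7), round(variance, 8))
```

The row count and sum agree exactly with the number of observations and the reported Sum, which is a quick way to confirm that the frequency table is complete.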

UNIT_TYPE
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.2 MiB

DRIVER  352571
PARKED  4461
DRIVERLESS  181
NON-CONTACT VEHICLE  21

Length

Max length: 19
Median length: 6
Mean length: 6.0027909
Min length: 6

Characters and Unicode

Total characters: 2144401
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DRIVER
2nd row: DRIVER
3rd row: DRIVER
4th row: DRIVER
5th row: DRIVER

Common Values

Value  Count  Frequency (%)
DRIVER  352571  98.7%
PARKED  4461  1.2%
DRIVERLESS  181  0.1%
NON-CONTACT VEHICLE  21  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
driver  352571  98.7%
parked  4461  1.2%
driverless  181  0.1%
non-contact  21  < 0.1%
vehicle  21  < 0.1%

Most occurring characters

Value  Count  Frequency (%)
R  709965  33.1%
E  357436  16.7%
D  357213  16.7%
I  352773  16.5%
V  352773  16.5%
A  4482  0.2%
P  4461  0.2%
K  4461  0.2%
S  362  < 0.1%
L  202  < 0.1%
Other values (7)  273  < 0.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  2144401  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  2144401  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  2144401  100.0%

VEHICLE_ID
Real number (ℝ)

High correlation  Unique 

Distinct: 357234
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 267320.61
Minimum: 2
Maximum: 535741
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 2
5-th percentile: 26253.3
Q1: 136218.25
Median: 266644.5
Q3: 399975.75
95-th percentile: 508502.35
Maximum: 535741
Range: 535739
Interquartile range (IQR): 263757.5

Descriptive statistics

Standard deviation: 154190.82
Coefficient of variation (CV): 0.57680107
Kurtosis: -1.1870718
Mean: 267320.61
Median Absolute Deviation (MAD): 131893
Skewness: 0.0020802487
Sum: 9.5496012 × 10^10
Variance: 2.3774808 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value  Count  Frequency (%)
460661  1  < 0.1%
535738  1  < 0.1%
39360  1  < 0.1%
379612  1  < 0.1%
379611  1  < 0.1%
7376  1  < 0.1%
4682  1  < 0.1%
4677  1  < 0.1%
83  1  < 0.1%
59  1  < 0.1%
Other values (357224)  357224  > 99.9%

Minimum 10 values

Value  Count  Frequency (%)
2  1  < 0.1%
3  1  < 0.1%
7  1  < 0.1%
9  1  < 0.1%
10  1  < 0.1%
11  1  < 0.1%
12  1  < 0.1%
13  1  < 0.1%
14  1  < 0.1%
15  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
535741  1  < 0.1%
535738  1  < 0.1%
535725  1  < 0.1%
535723  1  < 0.1%
535718  1  < 0.1%
535717  1  < 0.1%
535714  1  < 0.1%
535710  1  < 0.1%
535709  1  < 0.1%
535706  1  < 0.1%

MAKE
Text

Distinct: 578
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 MiB

Length

Max length: 60
Median length: 53
Mean length: 10.015421
Min length: 2

Characters and Unicode

Total characters: 3577849
Distinct characters: 42
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 241
Unique (%): 0.1%

Sample

1st row: TOYOTA MOTOR COMPANY, LTD.
2nd row: FORD
3rd row: CHEVROLET
4th row: JEEP
5th row: JEEP
Words

Value  Count  Frequency (%)
motor  49014  8.8%
ltd  47247  8.5%
company  47148  8.5%
toyota  47137  8.5%
chevrolet  45230  8.1%
ford  39557  7.1%
nissan  31844  5.7%
honda  29325  5.3%
corp  17965  3.2%
dodge  17705  3.2%
Other values (915)  185104  33.2%

Most occurring characters

Value  Count  Frequency (%)
O  463297  12.9%
T  286728  8.0%
A  256850  7.2%
N  234418  6.6%
R  233021  6.5%
E  218213  6.1%
(blank)  200042  5.6%
D  191875  5.4%
C  172025  4.8%
L  154434  4.3%
Other values (32)  1166946  32.6%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  3577849  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  3577849  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  3577849  100.0%

MODEL
Text

Distinct: 1477
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 25.2 MiB

Length

Max length: 60
Median length: 56
Mean length: 8.9797864
Min length: 2

Characters and Unicode

Total characters: 3207885
Distinct characters: 73
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 363
Unique (%): 0.1%

Sample

1st row: Highlander(beginning vehicle year 2001)
2nd row: EXPLORER
3rd row: MALIBU (CHEVELLE)
4th row: LAREDO
5th row: Liberty
Words

Value  Count  Frequency (%)
unknown  52231  10.2%
nissan  17263  3.4%
camry  15186  3.0%
altima  9087  1.8%
corolla  8374  1.6%
civic  7656  1.5%
accord  7298  1.4%
sport  7232  1.4%
chevelle  7182  1.4%
malibu  7178  1.4%
Other values (1771)  373721  72.9%

Most occurring characters

Value  Count  Frequency (%)
N  282577  8.8%
A  247338  7.7%
E  179344  5.6%
R  175207  5.5%
O  164779  5.1%
(blank)  155174  4.8%
C  134495  4.2%
U  122341  3.8%
S  120315  3.8%
I  106935  3.3%
Other values (63)  1519380  47.4%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  3207885  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  3207885  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  3207885  100.0%

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.8 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 714468
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IL
2nd row: IL
3rd row: IL
4th row: IL
5th row: IL
Words

Value  Count  Frequency (%)
il  336201  94.1%
in  6446  1.8%
wi  2132  0.6%
mi  1643  0.5%
xx  959  0.3%
oh  888  0.2%
tx  835  0.2%
fl  809  0.2%
az  743  0.2%
ia  689  0.2%
Other values (42)  5889  1.6%

Most occurring characters

Value  Count  Frequency (%)
I  347166  48.6%
L  337231  47.2%
N  8328  1.2%
M  3272  0.5%
A  3261  0.5%
X  2753  0.4%
W  2284  0.3%
O  1980  0.3%
T  1429  0.2%
C  953  0.1%
Other values (15)  5811  0.8%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  714468  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  714468  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  714468  100.0%

VEHICLE_YEAR
Real number (ℝ)

Skewed 

Distinct: 130
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.8658
Minimum: 1900
Maximum: 9999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 1900
5-th percentile: 1999
Q1: 2005
Median: 2011
Q3: 2014
95-th percentile: 2017
Maximum: 9999
Range: 8099
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 185.95222
Coefficient of variation (CV): 0.092335952
Kurtosis: 1825.8722
Mean: 2013.8658
Median Absolute Deviation (MAD): 4
Skewness: 42.64083
Sum: 7.1942134 × 10^8
Variance: 34578.227
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value  Count  Frequency (%)
2015  29738  8.3%
2014  27532  7.7%
2016  26667  7.5%
2013  25284  7.1%
2012  21984  6.2%
2017  20675  5.8%
2007  19094  5.3%
2011  18030  5.0%
2008  17931  5.0%
2006  17463  4.9%
Other values (120)  132836  37.2%

Minimum 10 values

Value  Count  Frequency (%)
1900  118  < 0.1%
1901  8  < 0.1%
1905  1  < 0.1%
1911  1  < 0.1%
1941  1  < 0.1%
1951  1  < 0.1%
1952  2  < 0.1%
1960  2  < 0.1%
1961  1  < 0.1%
1962  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
9999  192  0.1%
6043  1  < 0.1%
5015  1  < 0.1%
5012  1  < 0.1%
5007  1  < 0.1%
3023  1  < 0.1%
3017  1  < 0.1%
3016  1  < 0.1%
3013  7  < 0.1%
3012  1  < 0.1%
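
The extreme skewness of VEHICLE_YEAR is driven by a small number of implausible values, most visibly the 192 records coded 9999. As a rough illustration using only the rounded figures in this report, the sketch below backs those 192 records out of the reported sum to see how far the mean moves. Treating 9999 as a placeholder code is an assumption, and the other out-of-range years (such as 6043 or 1900) are left in, so the result is approximate.

```python
# Figures copied from the VEHICLE_YEAR statistics (the sum is rounded to 8 digits).
n_rows = 357234
reported_sum = 7.1942134e8
sentinel, sentinel_count = 9999, 192

# Remove the contribution of the 9999 records from both the sum and the count.
adjusted_mean = (reported_sum - sentinel * sentinel_count) / (n_rows - sentinel_count)

print(round(reported_sum / n_rows, 1))  # 2013.9
print(round(adjusted_mean, 1))          # 2009.6
```

With these inputs, dropping 0.05% of the rows shifts the mean by more than four years, while the median of 2011 and the IQR of 9 are unaffected by such values.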

VEHICLE_DEFECT
Categorical

Imbalance 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.8 MiB

NONE  245361
UNKNOWN  108387
OTHER  1406
BRAKES  1373
TIRES  202
Other values (12)  505

Length

Max length: 16
Median length: 4
Mean length: 4.9284503
Min length: 4

Characters and Unicode

Total characters: 1760610
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NONE
2nd row: NONE
3rd row: NONE
4th row: NONE
5th row: NONE

Common Values

Value  Count  Frequency (%)
NONE  245361  68.7%
UNKNOWN  108387  30.3%
OTHER  1406  0.4%
BRAKES  1373  0.4%
TIRES  202  0.1%
STEERING  189  0.1%
WHEELS  104  < 0.1%
SUSPENSION  58  < 0.1%
ENGINE/MOTOR  42  < 0.1%
FUEL SYSTEM  29  < 0.1%
Other values (7)  83  < 0.1%

Length

Histogram of lengths of the category

Words

Value  Count  Frequency (%)
none  245361  68.7%
unknown  108387  30.3%
other  1406  0.4%
brakes  1373  0.4%
tires  202  0.1%
steering  189  0.1%
wheels  104  < 0.1%
suspension  58  < 0.1%
engine/motor  42  < 0.1%
system  38  < 0.1%
Other values (9)  115  < 0.1%

Most occurring characters

Value  Count  Frequency (%)
N  816312  46.4%
O  355333  20.2%
E  249154  14.2%
K  109760  6.2%
W  108537  6.2%
U  108482  6.2%
R  3247  0.2%
S  2192  0.1%
T  1930  0.1%
H  1542  0.1%
Other values (14)  4121  0.2%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  1760610  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  1760610  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  1760610  100.0%

VEHICLE_TYPE
Categorical

Imbalance 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.3 MiB

PASSENGER  246201
SPORT UTILITY VEHICLE (SUV)  49366
VAN/MINI-VAN  20032
PICKUP  10144
UNKNOWN/NA  10043
Other values (12)  21448

Length

Max length: 27
Median length: 9
Mean length: 12.082845
Min length: 5

Characters and Unicode

Total characters: 4316403
Distinct characters: 32
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SPORT UTILITY VEHICLE (SUV)
2nd row: SPORT UTILITY VEHICLE (SUV)
3rd row: PASSENGER
4th row: PASSENGER
5th row: PASSENGER

Common Values

Value  Count  Frequency (%)
PASSENGER  246201  68.9%
SPORT UTILITY VEHICLE (SUV)  49366  13.8%
VAN/MINI-VAN  20032  5.6%
PICKUP  10144  2.8%
UNKNOWN/NA  10043  2.8%
TRUCK - SINGLE UNIT  7007  2.0%
BUS OVER 15 PASS.  4400  1.2%
OTHER  3791  1.1%
TRACTOR W/ SEMI-TRAILER  3432  1.0%
BUS UP TO 15 PASS.  813  0.2%
Other values (7)  2005  0.6%

Length

Histogram of lengths of the category

Words

Value  Count  Frequency (%)
passenger  246201  44.4%
vehicle  49926  9.0%
sport  49366  8.9%
utility  49366  8.9%
suv  49366  8.9%
van/mini-van  20032  3.6%
pickup  10144  1.8%
unknown/na  10043  1.8%
truck  7007  1.3%
(blank)  7007  1.3%
Other values (25)  55512  10.0%

Most occurring characters

Value  Count  Frequency (%)
E  618185  14.3%
S  617677  14.3%
N  360683  8.4%
R  329724  7.6%
P  321885  7.5%
A  310037  7.2%
G  253208  5.9%
I  221960  5.1%
(blank)  196736  4.6%
T  181050  4.2%
Other values (22)  905258  21.0%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  4316403  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  4316403  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  4316403  100.0%

VEHICLE_USE
Categorical

Imbalance 

Distinct: 25
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.1 MiB

PERSONAL  269825
UNKNOWN/NA  39358
OTHER  11072
TAXI/FOR HIRE  10490
COMMERCIAL - SINGLE UNIT  5152
Other values (20)  21337

Length

Max length: 28
Median length: 8
Mean length: 8.8046854
Min length: 3

Characters and Unicode

Total characters: 3145333
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PERSONAL
2nd row: PERSONAL
3rd row: PERSONAL
4th row: PERSONAL
5th row: PERSONAL

Common Values

Value  Count  Frequency (%)
PERSONAL  269825  75.5%
UNKNOWN/NA  39358  11.0%
OTHER  11072  3.1%
TAXI/FOR HIRE  10490  2.9%
COMMERCIAL - SINGLE UNIT  5152  1.4%
RIDESHARE SERVICE  3760  1.1%
OTHER TRANSIT  2884  0.8%
NOT IN USE  2416  0.7%
CTA  2310  0.6%
POLICE  2183  0.6%
Other values (15)  7784  2.2%

Length

Histogram of lengths of the category

Words

Value  Count  Frequency (%)
personal  269825  67.2%
unknown/na  39358  9.8%
other  13956  3.5%
taxi/for  10490  2.6%
hire  10490  2.6%
(blank)  7016  1.7%
commercial  6988  1.7%
single  5177  1.3%
unit  5177  1.3%
rideshare  3760  0.9%
Other values (28)  29551  7.4%

Most occurring characters

Value  Count  Frequency (%)
N  458887  14.6%
O  353147  11.2%
A  342746  10.9%
E  331864  10.6%
R  330104  10.5%
S  294690  9.4%
L  288237  9.2%
P  272216  8.7%
I  62365  2.0%
U  55408  1.8%
Other values (16)  355669  11.3%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  3145333  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  3145333  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  3145333  100.0%

TRAVEL_DIRECTION
Categorical

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.6 MiB

N  88375
S  85431
W  79472
E  77536
UNKNOWN  8439
Other values (4)  17981

Length

Max length: 7
Median length: 1
Mean length: 1.192073
Min length: 1

Characters and Unicode

Total characters: 425849
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: E
3rd row: N
4th row: S
5th row: E

Common Values

Value  Count  Frequency (%)
N  88375  24.7%
S  85431  23.9%
W  79472  22.2%
E  77536  21.7%
UNKNOWN  8439  2.4%
SE  5314  1.5%
NW  4708  1.3%
SW  3999  1.1%
NE  3960  1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
n  88375  24.7%
s  85431  23.9%
w  79472  22.2%
e  77536  21.7%
unknown  8439  2.4%
se  5314  1.5%
nw  4708  1.3%
sw  3999  1.1%
ne  3960  1.1%

Most occurring characters

Value  Count  Frequency (%)
N  122360  28.7%
W  96618  22.7%
S  94744  22.2%
E  86810  20.4%
U  8439  2.0%
K  8439  2.0%
O  8439  2.0%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  425849  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  425849  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  425849  100.0%

MANEUVER
Categorical

High correlation 

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 27.0 MiB

STRAIGHT AHEAD  193521
SLOW/STOP IN TRAFFIC  37310
TURNING LEFT  24993
BACKING  18299
TURNING RIGHT  14060
Other values (22)  69051

Length

Max length: 34
Median length: 14
Mean length: 14.39171
Min length: 5

Characters and Unicode

Total characters: 5141208
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: STRAIGHT AHEAD
2nd row: STRAIGHT AHEAD
3rd row: STRAIGHT AHEAD
4th row: STRAIGHT AHEAD
5th row: CHANGING LANES

Common Values

Value  Count  Frequency (%)
STRAIGHT AHEAD  193521  54.2%
SLOW/STOP IN TRAFFIC  37310  10.4%
TURNING LEFT  24993  7.0%
BACKING  18299  5.1%
TURNING RIGHT  14060  3.9%
UNKNOWN/NA  11352  3.2%
PASSING/OVERTAKING  9238  2.6%
CHANGING LANES  8983  2.5%
OTHER  6741  1.9%
ENTERING TRAFFIC LANE FROM PARKING  5383  1.5%
Other values (17)  27354  7.7%

Length

Histogram of lengths of the category

Words

Value  Count  Frequency (%)
straight  193521  26.3%
ahead  193521  26.3%
traffic  48255  6.6%
slow/stop  42568  5.8%
in  40484  5.5%
turning  39196  5.3%
left  27467  3.7%
backing  18299  2.5%
right  15576  2.1%
unknown/na  11352  1.5%
Other values (34)  105080  14.3%

Most occurring characters

Value  Count  Frequency (%)
A  728445  14.2%
T  602618  11.7%
H  420328  8.2%
I  415567  8.1%
(blank)  378085  7.4%
R  356877  6.9%
G  331248  6.4%
S  318640  6.2%
E  289309  5.6%
N  277762  5.4%
Other values (16)  1022329  19.9%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  5141208  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  5141208  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  5141208  100.0%

OCCUPANT_CNT
Real number (ℝ)

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2815661
Minimum: 0
Maximum: 60
Zeros: 8
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 60
Range: 60
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7764914
Coefficient of variation (CV): 0.60589257
Kurtosis: 351.00878
Mean: 1.2815661
Median Absolute Deviation (MAD): 0
Skewness: 10.10007
Sum: 457819
Variance: 0.60293889
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=39)
Common values

Value  Count  Frequency (%)
1  290494  81.3%
2  46154  12.9%
3  12849  3.6%
4  5118  1.4%
5  1707  0.5%
6  461  0.1%
7  184  0.1%
8  76  < 0.1%
9  39  < 0.1%
11  26  < 0.1%
Other values (29)  126  < 0.1%

Minimum 10 values

Value  Count  Frequency (%)
0  8  < 0.1%
1  290494  81.3%
2  46154  12.9%
3  12849  3.6%
4  5118  1.4%
5  1707  0.5%
6  461  0.1%
7  184  0.1%
8  76  < 0.1%
9  39  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
60  1  < 0.1%
44  1  < 0.1%
43  1  < 0.1%
41  1  < 0.1%
39  2  < 0.1%
37  1  < 0.1%
36  2  < 0.1%
35  2  < 0.1%
34  1  < 0.1%
33  2  < 0.1%

FIRST_CONTACT_POINT
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.8 MiB

Value | Count
FRONT | 80274
REAR | 60846
FRONT-RIGHT | 52156
FRONT-LEFT | 49758
SIDE-RIGHT | 26221
Other values (9) | 87979

Length

Max length: 17
Median length: 14
Mean length: 7.7099492
Min length: 4

Characters and Unicode

Total characters: 2754256
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FRONT-RIGHT
2nd row: FRONT-LEFT
3rd row: SIDE-LEFT
4th row: SIDE-LEFT
5th row: SIDE-RIGHT

Common Values

Value | Count | Frequency (%)
FRONT | 80274 | 22.5%
REAR | 60846 | 17.0%
FRONT-RIGHT | 52156 | 14.6%
FRONT-LEFT | 49758 | 13.9%
SIDE-RIGHT | 26221 | 7.3%
SIDE-LEFT | 23064 | 6.5%
REAR-LEFT | 22898 | 6.4%
REAR-RIGHT | 21100 | 5.9%
UNKNOWN | 12276 | 3.4%
NONE | 4032 | 1.1%
Other values (4) | 4609 | 1.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
front | 80274 | 22.2%
rear | 60846 | 16.8%
front-right | 52156 | 14.4%
front-left | 49758 | 13.8%
side-right | 26221 | 7.3%
side-left | 23064 | 6.4%
rear-left | 22898 | 6.3%
rear-right | 21100 | 5.8%
unknown | 12276 | 3.4%
none | 4032 | 1.1%
Other values (7) | 8740 | 2.4%

Most occurring characters

Value | Count | Frequency (%)
R | 497312 | 18.1%
T | 382469 | 13.9%
F | 278486 | 10.1%
E | 258587 | 9.4%
N | 227755 | 8.3%
O | 203008 | 7.4%
- | 195197 | 7.1%
I | 149437 | 5.4%
A | 113106 | 4.1%
H | 101105 | 3.7%
Other values (11) | 347794 | 12.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2754256 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
R | 497312 | 18.1%
T | 382469 | 13.9%
F | 278486 | 10.1%
E | 258587 | 9.4%
N | 227755 | 8.3%
O | 203008 | 7.4%
- | 195197 | 7.1%
I | 149437 | 5.4%
A | 113106 | 4.1%
H | 101105 | 3.7%
Other values (11) | 347794 | 12.6%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2754256 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
R | 497312 | 18.1%
T | 382469 | 13.9%
F | 278486 | 10.1%
E | 258587 | 9.4%
N | 227755 | 8.3%
O | 203008 | 7.4%
- | 195197 | 7.1%
I | 149437 | 5.4%
A | 113106 | 4.1%
H | 101105 | 3.7%
Other values (11) | 347794 | 12.6%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2754256 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
R | 497312 | 18.1%
T | 382469 | 13.9%
F | 278486 | 10.1%
E | 258587 | 9.4%
N | 227755 | 8.3%
O | 203008 | 7.4%
- | 195197 | 7.1%
I | 149437 | 5.4%
A | 113106 | 4.1%
H | 101105 | 3.7%
Other values (11) | 347794 | 12.6%

Anno
Real number (ℝ)

High correlation 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.2293
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 2014
5-th percentile: 2016
Q1: 2017
Median: 2017
Q3: 2018
95-th percentile: 2018
Maximum: 2019
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.87329284
Coefficient of variation (CV): 0.00043291699
Kurtosis: -0.27791011
Mean: 2017.2293
Median Absolute Deviation (MAD): 1
Skewness: -0.71394876
Sum: 7.206229 × 10^8
Variance: 0.76264038
Monotonicity: Decreasing
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
2018 | 162373 | 45.5%
2017 | 117300 | 32.8%
2016 | 60477 | 16.9%
2015 | 13526 | 3.8%
2019 | 3550 | 1.0%
2014 | 8 | < 0.1%

Value | Count | Frequency (%)
2014 | 8 | < 0.1%
2015 | 13526 | 3.8%
2016 | 60477 | 16.9%
2017 | 117300 | 32.8%
2018 | 162373 | 45.5%
2019 | 3550 | 1.0%

Value | Count | Frequency (%)
2019 | 3550 | 1.0%
2018 | 162373 | 45.5%
2017 | 117300 | 32.8%
2016 | 60477 | 16.9%
2015 | 13526 | 3.8%
2014 | 8 | < 0.1%

Mese
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0975327
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 8
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.4564858
Coefficient of variation (CV): 0.48699822
Kurtosis: -1.1541951
Mean: 7.0975327
Median Absolute Deviation (MAD): 3
Skewness: -0.24754956
Sum: 2535480
Variance: 11.947294
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
10 | 38676 | 10.8%
12 | 37932 | 10.6%
11 | 36325 | 10.2%
9 | 34367 | 9.6%
8 | 31415 | 8.8%
7 | 27762 | 7.8%
5 | 27481 | 7.7%
6 | 27218 | 7.6%
1 | 26570 | 7.4%
4 | 24371 | 6.8%
Other values (2) | 45117 | 12.6%

Value | Count | Frequency (%)
1 | 26570 | 7.4%
2 | 21114 | 5.9%
3 | 24003 | 6.7%
4 | 24371 | 6.8%
5 | 27481 | 7.7%
6 | 27218 | 7.6%
7 | 27762 | 7.8%
8 | 31415 | 8.8%
9 | 34367 | 9.6%
10 | 38676 | 10.8%

Value | Count | Frequency (%)
12 | 37932 | 10.6%
11 | 36325 | 10.2%
10 | 38676 | 10.8%
9 | 34367 | 9.6%
8 | 31415 | 8.8%
7 | 27762 | 7.8%
6 | 27218 | 7.6%
5 | 27481 | 7.7%
4 | 24371 | 6.8%
3 | 24003 | 6.7%

Giorno
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.527167
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
Median: 15
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.80583
Coefficient of variation (CV): 0.56712406
Kurtosis: -1.1801433
Mean: 15.527167
Median Absolute Deviation (MAD): 8
Skewness: 0.042023058
Sum: 5546832
Variance: 77.542643
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
10 | 12457 | 3.5%
1 | 12342 | 3.5%
5 | 12339 | 3.5%
13 | 12319 | 3.4%
2 | 12168 | 3.4%
14 | 12167 | 3.4%
17 | 12158 | 3.4%
3 | 12085 | 3.4%
6 | 11962 | 3.3%
16 | 11896 | 3.3%
Other values (21) | 235341 | 65.9%

Value | Count | Frequency (%)
1 | 12342 | 3.5%
2 | 12168 | 3.4%
3 | 12085 | 3.4%
4 | 11877 | 3.3%
5 | 12339 | 3.5%
6 | 11962 | 3.3%
7 | 11863 | 3.3%
8 | 11500 | 3.2%
9 | 11659 | 3.3%
10 | 12457 | 3.5%

Value | Count | Frequency (%)
31 | 7237 | 2.0%
30 | 10746 | 3.0%
29 | 10825 | 3.0%
28 | 11119 | 3.1%
27 | 11143 | 3.1%
26 | 11182 | 3.1%
25 | 10648 | 3.0%
24 | 10822 | 3.0%
23 | 11770 | 3.3%
22 | 11186 | 3.1%

Interactions

[64 pairwise interaction plots rendered with Matplotlib v3.9.2; images not included in this text version.]

Correlations

Variable | Anno | CRASH_UNIT_ID | FIRST_CONTACT_POINT | Giorno | MANEUVER | Mese | OCCUPANT_CNT | TRAVEL_DIRECTION | UNIT_NO | UNIT_TYPE | VEHICLE_DEFECT | VEHICLE_ID | VEHICLE_TYPE | VEHICLE_USE | VEHICLE_YEAR
Anno | 1.000 | 0.930 | 0.024 | -0.038 | 0.024 | -0.203 | 0.024 | 0.014 | -0.012 | 0.006 | 0.014 | 0.930 | 0.026 | 0.043 | 0.120
CRASH_UNIT_ID | 0.930 | 1.000 | 0.023 | -0.002 | 0.024 | 0.144 | 0.025 | 0.014 | -0.011 | 0.006 | 0.013 | 1.000 | 0.026 | 0.035 | 0.126
FIRST_CONTACT_POINT | 0.024 | 0.023 | 1.000 | 0.003 | 0.168 | 0.008 | 0.007 | 0.053 | 0.163 | 0.072 | 0.060 | 0.023 | 0.087 | 0.092 | 0.025
Giorno | -0.038 | -0.002 | 0.003 | 1.000 | 0.004 | 0.027 | -0.002 | 0.004 | -0.000 | 0.000 | 0.003 | -0.002 | 0.004 | 0.003 | -0.001
MANEUVER | 0.024 | 0.024 | 0.168 | 0.004 | 1.000 | 0.014 | 0.007 | 0.128 | 0.155 | 0.590 | 0.055 | 0.024 | 0.061 | 0.080 | 0.024
Mese | -0.203 | 0.144 | 0.008 | 0.027 | 0.014 | 1.000 | 0.002 | 0.008 | 0.001 | 0.003 | 0.004 | 0.144 | 0.014 | 0.010 | 0.014
OCCUPANT_CNT | 0.024 | 0.025 | 0.007 | -0.002 | 0.007 | 0.002 | 1.000 | 0.000 | 0.095 | 0.000 | 0.016 | 0.025 | 0.064 | 0.074 | 0.015
TRAVEL_DIRECTION | 0.014 | 0.014 | 0.053 | 0.004 | 0.128 | 0.008 | 0.000 | 1.000 | 0.022 | 0.029 | 0.026 | 0.014 | 0.033 | 0.033 | 0.021
UNIT_NO | -0.012 | -0.011 | 0.163 | -0.000 | 0.155 | 0.001 | 0.095 | 0.022 | 1.000 | 0.050 | 0.081 | -0.011 | 0.044 | 0.083 | 0.121
UNIT_TYPE | 0.006 | 0.006 | 0.072 | 0.000 | 0.590 | 0.003 | 0.000 | 0.029 | 0.050 | 1.000 | 0.015 | 0.006 | 0.016 | 0.137 | 0.000
VEHICLE_DEFECT | 0.014 | 0.013 | 0.060 | 0.003 | 0.055 | 0.004 | 0.016 | 0.026 | 0.081 | 0.015 | 1.000 | 0.013 | 0.060 | 0.108 | 0.020
VEHICLE_ID | 0.930 | 1.000 | 0.023 | -0.002 | 0.024 | 0.144 | 0.025 | 0.014 | -0.011 | 0.006 | 0.013 | 1.000 | 0.025 | 0.035 | 0.126
VEHICLE_TYPE | 0.026 | 0.026 | 0.087 | 0.004 | 0.061 | 0.014 | 0.064 | 0.033 | 0.044 | 0.016 | 0.060 | 0.025 | 1.000 | 0.326 | 0.042
VEHICLE_USE | 0.043 | 0.035 | 0.092 | 0.003 | 0.080 | 0.010 | 0.074 | 0.033 | 0.083 | 0.137 | 0.108 | 0.035 | 0.326 | 1.000 | 0.022
VEHICLE_YEAR | 0.120 | 0.126 | 0.025 | -0.001 | 0.024 | 0.014 | 0.015 | 0.021 | 0.121 | 0.000 | 0.020 | 0.126 | 0.042 | 0.022 | 1.000
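
One way to read the correlation table is to list the variable pairs whose absolute coefficient exceeds a cutoff. The sketch below is a minimal illustration: the 0.5 threshold and the `high_correlation_pairs` helper are assumptions made for this example, not values or functions taken from the report, and only a subset of the coefficients is copied in.

```python
# A subset of coefficients copied from the correlation table (upper triangle).
pairs = {
    ("Anno", "CRASH_UNIT_ID"): 0.930,
    ("Anno", "VEHICLE_ID"): 0.930,
    ("Anno", "Mese"): -0.203,
    ("CRASH_UNIT_ID", "VEHICLE_ID"): 1.000,
    ("MANEUVER", "UNIT_TYPE"): 0.590,
    ("VEHICLE_TYPE", "VEHICLE_USE"): 0.326,
    ("FIRST_CONTACT_POINT", "MANEUVER"): 0.168,
}

THRESHOLD = 0.5  # assumed cutoff for this sketch; not stated in the report


def high_correlation_pairs(pairs, threshold=THRESHOLD):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


flagged = high_correlation_pairs(pairs)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cutoff the flagged pairs are CRASH_UNIT_ID with VEHICLE_ID, Anno with each of those two identifiers, and MANEUVER with UNIT_TYPE, which is consistent with the High correlation alerts listed in the Overview.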

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | CRASH_UNIT_ID | RD_NO | CRASH_DATE | UNIT_NO | UNIT_TYPE | VEHICLE_ID | MAKE | MODEL | LIC_PLATE_STATE | VEHICLE_YEAR | VEHICLE_DEFECT | VEHICLE_TYPE | VEHICLE_USE | TRAVEL_DIRECTION | MANEUVER | OCCUPANT_CNT | FIRST_CONTACT_POINT | Anno | Mese | Giorno
1 | 561563 | JC113627 | 2019-01-11 23:36:00 | 2 | DRIVER | 535738.0 | TOYOTA MOTOR COMPANY, LTD. | Highlander(beginning vehicle year 2001) | IL | 2003.0 | NONE | SPORT UTILITY VEHICLE (SUV) | PERSONAL | S | STRAIGHT AHEAD | 1.0 | FRONT-RIGHT | 2019 | 1 | 11
2 | 561564 | JC113627 | 2019-01-11 23:36:00 | 1 | DRIVER | 535741.0 | FORD | EXPLORER | IL | 2001.0 | NONE | SPORT UTILITY VEHICLE (SUV) | PERSONAL | E | STRAIGHT AHEAD | 1.0 | FRONT-LEFT | 2019 | 1 | 11
3 | 561540 | JC113637 | 2019-01-11 23:31:00 | 1 | DRIVER | 535714.0 | CHEVROLET | MALIBU (CHEVELLE) | IL | 2013.0 | NONE | PASSENGER | PERSONAL | N | STRAIGHT AHEAD | 1.0 | SIDE-LEFT | 2019 | 1 | 11
4 | 561541 | JC113637 | 2019-01-11 23:31:00 | 2 | DRIVER | 535718.0 | JEEP | LAREDO | IL | 2016.0 | NONE | PASSENGER | PERSONAL | S | STRAIGHT AHEAD | 1.0 | SIDE-LEFT | 2019 | 1 | 11
5 | 561542 | JC113630 | 2019-01-11 23:22:00 | 1 | DRIVER | 535717.0 | JEEP | Liberty | IL | 2015.0 | NONE | PASSENGER | PERSONAL | E | CHANGING LANES | 1.0 | SIDE-RIGHT | 2019 | 1 | 11
7 | 561528 | JC113604 | 2019-01-11 23:08:00 | 2 | DRIVER | 535706.0 | TOYOTA MOTOR COMPANY, LTD. | CAMRY | IL | 2007.0 | UNKNOWN | PASSENGER | TAXI/FOR HIRE | W | STRAIGHT AHEAD | 1.0 | REAR | 2019 | 1 | 11
9 | 561514 | JC113579 | 2019-01-11 22:45:00 | 2 | DRIVER | 535694.0 | PONTIAC | BONNEVILLE | IL | 2002.0 | NONE | PASSENGER | PERSONAL | S | AVOIDING VEHICLES/OBJECTS | 2.0 | REAR | 2019 | 1 | 11
10 | 561546 | JC113617 | 2019-01-11 22:28:00 | 1 | DRIVER | 535723.0 | BUICK | REGAL | IL | 2011.0 | UNKNOWN | PASSENGER | PERSONAL | N | STRAIGHT AHEAD | 1.0 | FRONT | 2019 | 1 | 11
11 | 561547 | JC113617 | 2019-01-11 22:28:00 | 2 | DRIVER | 535725.0 | CHEVROLET | UNKNOWN | IL | 2010.0 | UNKNOWN | PASSENGER | PERSONAL | N | STRAIGHT AHEAD | 2.0 | REAR | 2019 | 1 | 11
13 | 561516 | JC113568 | 2019-01-11 22:16:00 | 2 | DRIVER | 535695.0 | DODGE | DART | IL | 2014.0 | UNKNOWN | PASSENGER | PERSONAL | S | STRAIGHT AHEAD | 1.0 | FRONT-RIGHT | 2019 | 1 | 11

Index | CRASH_UNIT_ID | RD_NO | CRASH_DATE | UNIT_NO | UNIT_TYPE | VEHICLE_ID | MAKE | MODEL | LIC_PLATE_STATE | VEHICLE_YEAR | VEHICLE_DEFECT | VEHICLE_TYPE | VEHICLE_USE | TRAVEL_DIRECTION | MANEUVER | OCCUPANT_CNT | FIRST_CONTACT_POINT | Anno | Mese | Giorno
460426 | 560784 | JC111992 | 2015-01-10 17:15:00 | 1 | DRIVER | 534993.0 | FORD | UNKNOWN | IL | 2012.0 | UNKNOWN | UNKNOWN/NA | UNKNOWN/NA | UNKNOWN | STRAIGHT AHEAD | 1.0 | FRONT | 2015 | 1 | 10
460427 | 560785 | JC111992 | 2015-01-10 17:15:00 | 2 | DRIVER | 535002.0 | CHEVROLET | MALIBU (CHEVELLE) | IL | 2009.0 | NONE | PASSENGER | PERSONAL | UNKNOWN | STRAIGHT AHEAD | 1.0 | REAR | 2015 | 1 | 10
460429 | 70308 | HZ400518 | 2014-08-20 16:50:00 | 1 | DRIVER | 67994.0 | PONTIAC | UNKNOWN | IL | 2002.0 | NONE | PASSENGER | PERSONAL | E | SLOW/STOP IN TRAFFIC | 1.0 | FRONT | 2014 | 8 | 20
460430 | 70309 | HZ400518 | 2014-08-20 16:50:00 | 2 | DRIVER | 67999.0 | KIA MOTORS CORP | Rio | IL | 2002.0 | NONE | PASSENGER | PERSONAL | E | SLOW/STOP IN TRAFFIC | 1.0 | REAR | 2014 | 8 | 20
460431 | 30750 | HZ164689 | 2014-02-24 19:45:00 | 1 | DRIVER | 29699.0 | VOLVO | UNKNOWN | IL | 2004.0 | NONE | PASSENGER | PERSONAL | S | STRAIGHT AHEAD | 1.0 | FRONT | 2014 | 2 | 24
460432 | 30751 | HZ164689 | 2014-02-24 19:45:00 | 2 | DRIVER | 29701.0 | CHEVROLET | UNKNOWN | TN | 2016.0 | NONE | PASSENGER | PERSONAL | S | TURNING LEFT | 1.0 | REAR | 2014 | 2 | 24
460433 | 24495 | HZ122950 | 2014-01-21 07:40:00 | 1 | DRIVER | 23633.0 | TOYOTA MOTOR COMPANY, LTD. | COROLLA | IL | 2005.0 | NONE | PASSENGER | NOT IN USE | S | STRAIGHT AHEAD | 1.0 | SIDE-LEFT | 2014 | 1 | 21
460434 | 24496 | HZ122950 | 2014-01-21 07:40:00 | 2 | DRIVER | 23634.0 | NISSAN | ROGUE | IL | 2013.0 | NONE | PASSENGER | PERSONAL | W | STRAIGHT AHEAD | 1.0 | FRONT | 2014 | 1 | 21
460435 | 481321 | JB442550 | 2014-01-18 18:14:00 | 1 | DRIVER | 460655.0 | MERCEDES-BENZ | UNKNOWN | IL | 2016.0 | UNKNOWN | PASSENGER | UNKNOWN/NA | E | LEAVING TRAFFIC LANE TO PARK | 1.0 | FRONT-RIGHT | 2014 | 1 | 18
460436 | 481322 | JB442550 | 2014-01-18 18:14:00 | 2 | PARKED | 460661.0 | DODGE | CHARGER | IL | 2018.0 | NONE | PASSENGER | PERSONAL | E | PARKED | 1.0 | FRONT-LEFT | 2014 | 1 | 18
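
In every sample row, Anno, Mese, and Giorno match the year, month, and day of CRASH_DATE. The report does not say how those columns were produced, so the following is only a sketch of one plausible derivation; `split_crash_date` is a hypothetical helper, and the timestamp format is inferred from the sample rows.

```python
from datetime import datetime


def split_crash_date(crash_date: str) -> dict:
    """Split a 'YYYY-MM-DD HH:MM:SS' timestamp into year, month, and day parts."""
    ts = datetime.strptime(crash_date, "%Y-%m-%d %H:%M:%S")
    return {"Anno": ts.year, "Mese": ts.month, "Giorno": ts.day}


# CRASH_DATE values copied from the sample rows.
print(split_crash_date("2019-01-11 23:36:00"))
print(split_crash_date("2014-08-20 16:50:00"))
```

The parts produced for these two timestamps agree with the Anno, Mese, and Giorno values shown in the corresponding sample rows.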